
47 research outputs found

Attribute-Guided Face Generation Using Conditional CycleGAN

Authors: Lu Yongyi, Tai Yu-Wing, Tang Chi-Keung
Publication date: 14/11/2018

We are interested in attribute-guided face generation: given a low-res face input image, an attribute vector that can be extracted from a high-res image (attribute image), our new method generates a high-res face image for the low-res input that satisfies the given attributes. To address this problem, we condition the CycleGAN and propose conditional CycleGAN, which is designed to 1) handle unpaired training data because the training low/high-res and high-res attribute images may not necessarily align with each other, and to 2) allow easy control of the appearance of the generated face via the input attributes. We demonstrate impressive results on the attribute-guided conditional CycleGAN, which can synthesize realistic face images with appearance easily controlled by user-supplied attributes (e.g., gender, makeup, hair color, eyeglasses). Using the attribute image as identity to produce the corresponding conditional vector and by incorporating a face verification network, the attribute-guided network becomes the identity-guided conditional CycleGAN which produces impressive and interesting results on identity transfer. We demonstrate three applications on identity-guided conditional CycleGAN: identity-preserving face superresolution, face swapping, and frontal face generation, which consistently show the advantage of our new method.

Comment: ECCV 201

arXiv.org e-Print Archive

Crossref

Latent Embeddings for Collective Activity Recognition

Authors: Hu Jian-Fang, Tang Yongyi, Zhang Peizhen, Zheng Wei-Shi
Publication date: 20/09/2017

Rather than simply recognizing the action of each person individually, collective activity recognition aims to determine what a group of people is doing in a collective scene. Previous state-of-the-art methods use hand-crafted potentials in conventional graphical models, which can only define a limited range of relations. Thus, the complex structural dependencies among individuals involved in a collective scenario cannot be fully modeled. In this paper, we overcome these limitations by embedding latent variables into feature space and learning the feature mapping functions in a deep learning framework. The embeddings of latent variables build a global relation containing person-group interactions and richer contextual information by jointly modeling a broader range of individuals. In addition, we incorporate an attention mechanism during embedding to achieve more compact representations. We evaluate our method on three collective activity datasets, including a much larger dataset that we contribute in this work. The proposed model achieves clearly better performance than the state-of-the-art methods in our experiments.

Comment: 6 pages, accepted by IEEE-AVSS201

arXiv.org e-Print Archive

Crossref

The Quality of Life of Patients with Colorectal Cancer and a Stoma in China: A Quantitative Cross-sectional Study

Authors: Chen Yongyi, He Pingping, Li Xuying, Liu Huayun, Shen Boyong, Tang Xinhui, Wei Di, Xu Xianghua, Yu Juping, Zhu Xiaomei
Publication venue: Ovid Technologies (Wolters Kluwer Health)
Publication date: 01/06/2021

University of South Wales Research Explorer

CLIP-Driven Universal Model for Organ Segmentation and Tumor Detection

Authors: Chen Jie-Neng, Landman Bennett A., Liu Jie, Lu Yongyi, Tang Yucheng, Xiao Junfei, Yuan Yixuan, Yuille Alan, Zhang Yixiao, Zhou Zongwei
Publication date: 31/05/2023

An increasing number of public datasets have shown a marked impact on automated organ segmentation and tumor detection. However, due to the small size and partial labeling of each dataset, as well as a limited investigation of diverse types of tumors, the resulting models are often limited to segmenting specific organs/tumors, ignore the semantics of anatomical structures, and cannot be extended to novel domains. To address these issues, we propose the CLIP-Driven Universal Model, which incorporates text embeddings learned from Contrastive Language-Image Pre-training (CLIP) into segmentation models. This CLIP-based label encoding captures anatomical relationships, enabling the model to learn a structured feature embedding and segment 25 organs and 6 types of tumors. The proposed model is developed from an assembly of 14 datasets, using a total of 3,410 CT scans for training, and is then evaluated on 6,162 external CT scans from 3 additional datasets. We rank first on the Medical Segmentation Decathlon (MSD) public leaderboard and achieve state-of-the-art results on Beyond The Cranial Vault (BTCV). Additionally, the Universal Model is computationally more efficient (6x faster) than dataset-specific models, generalizes better to CT scans from varying sites, and shows stronger transfer learning performance on novel tasks.

Comment: Rank first in Medical Segmentation Decathlon (MSD) Competition

arXiv.org e-Print Archive